Voice activity detection in degraded speech using excitation source information
نویسندگان
چکیده
This paper proposes a method for detection of voiced regions from speech signals collected in noisy environment. The proposed method is based on the characteristics of excitation source of speech production. The degraded speech signal is processed by linear prediction analysis for deriving the linear prediction residual. Hilbert envelope of the linear prediction residual is processed using covariance analysis to obtain coherentlyadded covariance signal. The periodicity property of the coherently added covariance signal is exploited to detect the voiced regions using autocorrelation analysis. The performance of the proposed voice activity detection algorithm is evaluated under different noise environments and at different levels of degradation.
منابع مشابه
A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)
Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, much of the non-speech signals maybe classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...
متن کاملExtraction of Excitation Information from Speech and Its Applications for Expressive Speech Processing
Through speech production mechanism, speech with different voice qualities such as phonations, emotions, expressive singing and other paralinguistic sounds are also produced. Most of these sounds demonstrate these features mostly due to the excitation component (vibration of the vocal folds at the glottis) whereas the dynamic vocal tract system primarily conveys the message. Hence, the excitati...
متن کاملAudiovisual speech source separation: a regularization method based on visual voice activity detection
Audio-visual speech source separation consists in mixing visual speech processing techniques (e.g. lip parameters tracking) with source separation methods to improve and/or simplify the extraction of a speech signal from a mixture of acoustic signals. In this paper, we present a new approach to this problem: visual information is used here as a voice activity detector (VAD). Results show that, ...
متن کاملEnhancement of speech in multispeaker environment
In this paper a method based on the excitation source information is proposed for enhancement of speech, degraded by speech from other speakers. Speech from multiple speakers is simultaneously collected over two spatially distributed microphones. Time-delay of each speaker with respect to the two microphones is estimated using the excitation source information. A weight function is derived for ...
متن کاملEvaluation of Glottal Epoch Detection Algorithms on Different Voice Types
According to the source-filter model of speech production, speech can be represented by passing the excitation signal through the vocal tract filter. The epoch or instant of maximum excitation corresponds to the glottal closure instant. Several speech processing applications require robust epoch detection but this can be a difficult task. Although state-of-the-art epoch estimation methods can p...
متن کامل